VLSI Implementation of Integer DCT Architectures for HEVC in FPGA Technology
Abstract
The High Efficiency Video Coding (HEVC) inverse transform for residual coding uses 2-D transforms from 4x4 to 32x32 with higher precision than H.264/AVC's 4x4 and 8x8 transforms, which increases hardware complexity. In this paper, an energy- and area-efficient VLSI architecture of an HEVC-compliant inverse transform and dequantization engine is presented. We implement a pipelining scheme that processes all transform sizes at a minimum throughput of 2 pixels/cycle, with zero-column skipping for improved throughput. We use data gating in the 1-D Inverse Discrete Cosine Transform engine to improve energy efficiency for smaller transform sizes. A high-density SRAM-based transpose memory is used for an area-efficient design. This design supports decoding of 4K Ultra-HD (3840x2160) video at 30 frames/s. The inverse transform engine takes 98.1 kgate of logic, 16.4 kbit of SRAM and 10.82 pJ/pixel, while the dequantization engine takes 27.7 kgate of logic, 8.2 kbit of SRAM and 1.10 pJ/pixel in 40 nm CMOS technology. Although larger transforms require more computation per coefficient, they typically contain a smaller proportion of non-zero coefficients. Due to this trade-off, larger transforms can be more energy-efficient.
Keywords: HEVC, Inverse Discrete Cosine Transform, Transpose Memory, Data Gating.
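The zero-column skipping mentioned in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the paper's hardware design: it uses the standard HEVC 4-point core transform matrix, assumes a column-first pass order with stage shifts of 7 and 20 minus the bit depth, and omits the 16-bit clipping of intermediate values. The function names `idct4_1d` and `inverse_4x4` are hypothetical.

```python
# HEVC 4-point core transform matrix (each row is one basis vector).
M4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def idct4_1d(coeffs, shift):
    """1-D 4-point inverse transform: out[n] = sum_k M4[k][n] * coeffs[k],
    followed by rounding and a right shift."""
    rnd = 1 << (shift - 1)
    return [
        (sum(M4[k][n] * coeffs[k] for k in range(4)) + rnd) >> shift
        for n in range(4)
    ]


def inverse_4x4(block, bit_depth=8):
    """2-D inverse transform of a 4x4 coefficient block (list of rows).

    Assumption: the first stage runs over columns, the second over rows.
    An all-zero input column produces an all-zero output, so the first
    stage skips it. Returns (residual block, number of skipped columns).
    """
    shift1 = 7                # assumed first-stage shift
    shift2 = 20 - bit_depth   # assumed second-stage shift (12 for 8-bit video)
    skipped = 0
    tmp = [[0] * 4 for _ in range(4)]

    # Stage 1: column transforms, skipping columns with no non-zero coefficient.
    for c in range(4):
        col = [block[r][c] for r in range(4)]
        if not any(col):
            skipped += 1
            continue
        out = idct4_1d(col, shift1)
        for r in range(4):
            tmp[r][c] = out[r]

    # Stage 2: row transforms on the intermediate block.
    residual = [idct4_1d(tmp[r], shift2) for r in range(4)]
    return residual, skipped


if __name__ == "__main__":
    dc_only = [[64, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
    residual, skipped = inverse_4x4(dc_only)
    print(residual, skipped)
```

For a block holding only a DC coefficient, three of the four first-stage column transforms are skipped, which is the kind of saving that grows with the larger, sparser transform sizes described in the abstract.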
Similar resources
FPGA implementation of Integer DCT for HEVC
In this paper, area-efficient architectures for the implementation of the integer discrete cosine transform (DCT) of different lengths to be used in High Efficiency Video Coding (HEVC) are proposed. An efficient constant matrix multiplication scheme can be used to derive parallel architectures for 1-D integer DCT of different lengths such as 4, 8, 16, and 32. Also power-efficient structures for fo...
An Improved Image Compressor Using Fast DCT Algorithm
The low energy consumption that the multimedia market requires of video processing systems such as HEVC has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer ...
Low-Cost and High-Throughput Hardware Design for the HEVC 16x16 2-D DCT Transform
This article presents the hardware design of the 16x16 2-D DCT used in the new video coding standard, High Efficiency Video Coding (HEVC). The transform stage is one of the innovations proposed by HEVC, since a variable-size transform stage is available (from 4x4 to 32x32), allowing the use of transforms with larger dimensions than those used in previous standards. The presented design explores...
Implementation of Reusable DCT Architectures
In this paper, we present area- and power-efficient architectures for the implementation of the integer discrete cosine transform (DCT) of different lengths to be used in High Efficiency Video Coding (HEVC). We show that an efficient constant matrix multiplication scheme can be used to derive parallel architectures for 1-D integer DCT of different lengths. We also show that the proposed structure could ...
Energy-efficient 8-point DCT Approximations: Theory and Hardware Architectures
Due to its remarkable energy compaction properties, the discrete cosine transform (DCT) is employed in a multitude of compression standards, such as JPEG and H.265/HEVC. Several low-complexity integer approximations for the DCT have been proposed for both 1-D and 2-D signal analysis. The increasing demand for low-complexity, energy-efficient methods requires algorithms with even lower computatio...